-
Abstract: Current and upcoming cosmological surveys will produce unprecedented amounts of high-dimensional data, which require complex high-fidelity forward simulations to accurately model both the physical processes and the systematic effects that describe the data-generation process. However, validating whether our theoretical models accurately describe the observed datasets remains a fundamental challenge. An additional complexity comes from choosing representations of the data that retain all the relevant cosmological information while reducing the dimensionality of the original dataset. In this work we present a novel framework combining scale-dependent neural summary statistics with normalizing flows to detect model misspecification in cosmological simulations through Bayesian evidence estimation. By conditioning our neural network models for data compression and evidence estimation on the smoothing scale, we systematically identify where theoretical models break down in a data-driven manner. We demonstrate a first application of our approach using simulated total matter and gas density fields from three hydrodynamic simulation suites with different subgrid physics implementations.
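The abstract does not include an implementation, but the core idea (fit a density model for compressed summaries conditioned on the smoothing scale, then compare log-evidence across scales) can be sketched as follows. This is a minimal PyTorch illustration, not the authors' code: the summary dimension, layer sizes, and the diagonal-Gaussian density standing in for the paper's normalizing flow are all assumptions.

```python
import math
import torch
import torch.nn as nn

class ScaleConditionedDensity(nn.Module):
    """p(summary | smoothing scale), with a diagonal-Gaussian head as a flow stand-in."""
    def __init__(self, summary_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * summary_dim),  # per-dimension mean and log-variance
        )

    def log_prob(self, summary: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
        mu, log_var = self.net(scale).chunk(2, dim=-1)
        return -0.5 * (((summary - mu) ** 2) / log_var.exp()
                       + log_var + math.log(2 * math.pi)).sum(dim=-1)

# Fit on simulated (summary, scale) pairs, then evaluate the same log-density on
# summaries of the target dataset; a sharp drop at some smoothing scale flags
# where the theoretical model is misspecified.
model = ScaleConditionedDensity(summary_dim=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
summaries = torch.randn(256, 8)   # placeholder compressed summaries
scales = torch.rand(256, 1)       # placeholder smoothing scales
for _ in range(200):
    loss = -model.log_prob(summaries, scales).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```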
-
ABSTRACT: The dark matter (DM) distribution in dwarf galaxies provides crucial insights into both structure formation and the particle nature of DM. GraphNPE (Graph Neural Posterior Estimator), first introduced in Nguyen et al. (2023), is a novel simulation-based inference framework that combines graph neural networks and normalizing flows to infer the DM density profile from line-of-sight stellar velocities. Here, we apply GraphNPE to satellite dwarf galaxies in the FIRE-2 Latte simulation suite of Milky Way-mass haloes, testing it against both Cold and Self-Interacting DM scenarios. Our method demonstrates superior precision compared to conventional Jeans-based approaches, recovering DM density profiles to within the 95 per cent confidence level even in systems with as few as 30 tracers. Moreover, we present the first evaluation of mass modelling methods in constraining two key parameters from realistic simulations: the peak circular velocity, $$V_\mathrm{max}$$, and the peak virial mass, $$M_\mathrm{200m}^\mathrm{peak}$$. Using only line-of-sight velocities, GraphNPE can reliably recover both $$V_\mathrm{max}$$ and $$M_\mathrm{200m}^\mathrm{peak}$$ within our quoted uncertainties, including for systems experiencing tidal effects ($$\gtrsim 63~{{\rm per\ cent}}$$ of systems are recovered within our 68 per cent confidence intervals and $$\gtrsim 92~{{\rm per\ cent}}$$ within our 95 per cent confidence intervals). The method achieves $$10-20~{{\rm per\ cent}}$$ accuracy in $$V_\mathrm{max}$$ recovery, while $$M_\mathrm{200m}^\mathrm{peak}$$ is recovered to $$0.1-0.4 \, \mathrm{dex}$$ accuracy. This work establishes GraphNPE as a robust tool for inferring DM density profiles in dwarf galaxies, offering promising avenues for constraining DM models. The framework’s potential extends beyond this study, as it can be adapted to non-spherical and disequilibrium models, showcasing the broader utility of simulation-based inference and graph-based learning in astrophysics.
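A minimal sketch of the GraphNPE-style pipeline, assuming PyTorch: a permutation-invariant encoder over (projected position, line-of-sight velocity) tracers and a Gaussian posterior head stand in for the graph neural network and the normalizing flow described above; layer sizes, parameter choices, and variable names are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TracerEncoder(nn.Module):
    """Permutation-invariant embedding of stellar tracers (GNN stand-in)."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))

    def forward(self, tracers: torch.Tensor) -> torch.Tensor:
        # tracers: (n_stars, 3) = (x_proj, y_proj, v_los); mean-pool over stars
        return self.rho(self.phi(tracers).mean(dim=0))

class PosteriorHead(nn.Module):
    """Gaussian stand-in for the conditional normalizing flow over profile
    parameters, e.g. (log V_max, log M_200m_peak)."""
    def __init__(self, hidden: int = 64, n_params: int = 2):
        super().__init__()
        self.net = nn.Linear(hidden, 2 * n_params)

    def forward(self, embedding: torch.Tensor):
        mu, log_sigma = self.net(embedding).chunk(2, dim=-1)
        return torch.distributions.Normal(mu, log_sigma.exp())

encoder, head = TracerEncoder(), PosteriorHead()
stars = torch.randn(30, 3)           # as few as ~30 tracers per dwarf galaxy
posterior = head(encoder(stars))     # amortized posterior for this galaxy
samples = posterior.sample((1000,))  # posterior draws of the profile parameters
```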
-
Abstract: Analyses of the cosmic 21-cm signal are hampered by astrophysical foregrounds that are far stronger than the signal itself. These foregrounds, typically confined to a wedge-shaped region in Fourier space, often necessitate the removal of the vast majority of modes, thereby degrading the quality of the data anisotropically. To address this challenge, we introduce a novel deep generative model based on stochastic interpolants to reconstruct the 21-cm data lost to wedge filtering. Our method leverages the non-Gaussian nature of the 21-cm signal to effectively map wedge-filtered 3D lightcones to samples from the conditional distribution of wedge-recovered lightcones. We demonstrate how our method is able to restore spatial information effectively, considering both varying cosmological initial conditions and astrophysics. Furthermore, we discuss a number of future avenues where this approach could be applied in analyses of the 21-cm signal, potentially offering new opportunities to improve our understanding of the Universe during the epochs of cosmic dawn and reionization.
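The mapping from wedge-filtered to wedge-recovered lightcones can be illustrated with a conditional interpolant training objective. The sketch below uses a deterministic linear interpolant (plain flow matching) and a toy 3D CNN rather than the full stochastic-interpolant model of the paper; the shapes, architecture, and hyperparameters are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class VelocityField(nn.Module):
    """Toy conditional velocity-field network over 3D lightcones."""
    def __init__(self, channels: int = 16):
        super().__init__()
        # input channels: noisy interpolant + wedge-filtered conditioning lightcone
        self.net = nn.Sequential(
            nn.Conv3d(2, channels, 3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, 1, 3, padding=1),
        )

    def forward(self, x_t, cond, t):
        # a real model would also embed the time t; omitted here for brevity
        return self.net(torch.cat([x_t, cond], dim=1))

model = VelocityField()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

x1 = torch.randn(2, 1, 16, 16, 32)   # placeholder "true" lightcones
cond = torch.randn_like(x1)          # placeholder wedge-filtered versions of x1
for _ in range(10):
    x0 = torch.randn_like(x1)        # noise endpoint of the interpolant
    t = torch.rand(x1.shape[0], 1, 1, 1, 1)
    x_t = (1 - t) * x0 + t * x1      # linear interpolant between noise and data
    target = x1 - x0                 # its time derivative (the regression target)
    loss = ((model(x_t, cond, t) - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```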
-
Abstract: A common setting in astronomy is the availability of a small number of high-quality observations and larger amounts of either lower-quality observations or synthetic data from simplified models. Time-domain astrophysics is a canonical example of this imbalance, with the number of supernovae observed photometrically outpacing the number observed spectroscopically by multiple orders of magnitude. At the same time, no data-driven models exist to understand these photometric and spectroscopic observables in a common context. Contrastive learning objectives, which have grown in popularity for aligning distinct data modalities in a shared embedding space, provide a potential solution to extract information from these modalities. We present Maven, the first foundation model for supernova science. To construct Maven, we first pre-train our model to align photometry and spectroscopy from 0.5 million synthetic supernovae using a contrastive objective. We then fine-tune the model on 4702 observed supernovae from the Zwicky Transient Facility. Maven reaches state-of-the-art performance on both classification and redshift estimation, despite the embeddings not being explicitly optimized for these tasks. Through ablation studies, we show that pre-training with synthetic data improves overall performance. In the upcoming era of the Vera C. Rubin Observatory, Maven will serve as a valuable tool for leveraging large, unlabeled, and multimodal time-domain datasets.
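The contrastive pre-training step can be sketched with a CLIP-style symmetric cross-entropy between photometry and spectroscopy embeddings. The encoders, input dimensions, and temperature below are placeholders, not Maven's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# toy encoders: light-curve features -> embedding, spectrum -> embedding
phot_encoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64))
spec_encoder = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, 64))

def contrastive_loss(photometry, spectra, temperature: float = 0.07):
    z_p = F.normalize(phot_encoder(photometry), dim=-1)
    z_s = F.normalize(spec_encoder(spectra), dim=-1)
    logits = z_p @ z_s.T / temperature        # pairwise cosine similarities
    labels = torch.arange(len(logits))        # matched pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.T, labels))

batch_phot = torch.randn(32, 128)    # placeholder photometry features
batch_spec = torch.randn(32, 1024)   # placeholder spectra
loss = contrastive_loss(batch_phot, batch_spec)
```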
-
We present PAPERCLIP (Proposal Abstracts Provide an Effective Representation for Contrastive Language-Image Pre-training), a method that associates astronomical observations imaged by telescopes with natural language using a neural network model. The model is fine-tuned from a pre-trained Contrastive Language–Image Pre-training (CLIP) model using successful observing proposal abstracts and corresponding downstream observations, with the abstracts optionally summarized via guided generation using large language models (LLMs). Using observations from the Hubble Space Telescope (HST) as an example, we show that the fine-tuned model embodies a meaningful joint representation between observations and natural language through quantitative evaluation as well as tests targeting image retrieval (i.e., finding the most relevant observations using natural language queries) and description retrieval (i.e., querying for astrophysical object classes and use cases most relevant to a given observation). Our study demonstrates the potential for using generalist foundation models rather than task-specific models for interacting with astronomical data by leveraging text as an interface.
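Once such a model is fine-tuned, image retrieval reduces to ranking observation embeddings by cosine similarity against a text-query embedding. The sketch below assumes precomputed embeddings; the `retrieve` helper, the embedding dimension, and the example query are illustrative, not the PAPERCLIP codebase.

```python
import torch
import torch.nn.functional as F

def retrieve(query_embedding: torch.Tensor,
             observation_embeddings: torch.Tensor,
             k: int = 5) -> torch.Tensor:
    """Return indices of the k observations most similar to the text query."""
    q = F.normalize(query_embedding, dim=-1)
    obs = F.normalize(observation_embeddings, dim=-1)
    scores = obs @ q                    # cosine similarity per observation
    return scores.topk(k).indices

# placeholder embeddings standing in for fine-tuned CLIP text/image outputs
query = torch.randn(512)               # e.g. an encoded query like "barred spiral galaxy"
observations = torch.randn(1000, 512)  # e.g. encoded HST observation cutouts
top_matches = retrieve(query, observations)
```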